Optimizing inductive queries in frequent itemsets mining

Authors

  • Marco Botta
  • Roberto Esposito
  • Arianna Gallo
Abstract

Let Q = {Q1, ..., Qn} be a set of past queries and let R = {R1, ..., Rn} be their results. Moreover, let Q0 be a query newly submitted to the system and let R0 be its result. The task of optimizing the extraction of R0 using the knowledge provided by Q and R has been approached in two distinct ways. In the first approach we search for a query Qi ∈ Q such that R0 ⊆ Ri (in this situation we say that Qi dominates Q0) and try to exploit this result to simplify the mining of R0. We note that in this situation the system must face two problems:
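The dominance idea in the first approach can be illustrated with a minimal sketch. It rests on an assumption that is not stated in the abstract: all queries run over the same dataset and differ only in their minimum support threshold, so a past query Qi dominates Q0 whenever its threshold is no higher than that of Q0. The names `find_dominating` and `answer_from_dominating` are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Itemset = FrozenSet[str]
Result = Dict[Itemset, int]          # itemset -> absolute support count
PastQuery = Tuple[int, Result]       # (minimum support used, stored result)


def find_dominating(new_minsup: int, past: List[PastQuery]) -> Optional[int]:
    """Return the index of a past query whose result contains R0, or None.

    Assumption: queries differ only in their minimum support, so any past
    query with a threshold <= new_minsup yields a superset of R0.
    """
    candidates = [i for i, (minsup, _) in enumerate(past) if minsup <= new_minsup]
    if not candidates:
        return None
    # Prefer the highest qualifying threshold: its result is the smallest
    # superset of R0, so the filtering pass below scans fewer itemsets.
    return max(candidates, key=lambda i: past[i][0])


def answer_from_dominating(new_minsup: int, dominating: Result) -> Result:
    """Derive R0 by filtering the dominating result instead of re-mining."""
    return {s: c for s, c in dominating.items() if c >= new_minsup}
```

Under this assumption R0 is obtained with a single pass over Ri and no access to the original data. Richer constraints would need a more careful dominance test than the threshold comparison used here.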


Similar resources

Constraint-Based Discovery and Inductive Queries: Application to Association Rule Mining

Recently, inductive databases (IDBs) have been proposed to address the problem of knowledge discovery from huge databases. Querying these databases requires primitives to: (1) select, manipulate and query data, (2) select, manipulate and query “interesting” patterns (i.e., those patterns that satisfy certain constraints), and (3) cross over patterns and data (e.g., selecting the data in which so...


Boolean Formulas and Frequent Sets

We consider the problem of how one can estimate the support of Boolean queries given a collection of frequent itemsets. We describe an algorithm that truncates the inclusion-exclusion sum to include only the frequencies of known itemsets, give a bound for its performance on disjunctions of attributes that is smaller than the previously known bound, and show that this bound is in fact achievable...


Inductive Databases and Multiple Uses of Frequent Itemsets: The cInQ Approach

Inductive databases (IDBs) have been proposed to address the problem of knowledge discovery from huge databases. With an IDB the user/analyst performs a set of very different operations on data using a query language powerful enough to express all the required processing, such as data preprocessing, pattern discovery and pattern postprocessing. We present a synthetic view on important concept...


Optimization of association rule mining queries

Levelwise algorithms (e.g., the Apriori algorithm) have been proved effective for association rule mining from sparse data. However, in many practical applications, the computation turns out to be intractable for the user-given frequency threshold and the lack of focus leads to huge collections of frequent itemsets. To tackle these problems, two promising issues have been investigated during the last...


A Performance Study of Three Disk-based Structures for Indexing and Querying Frequent Itemsets

Frequent itemset mining is an important problem in the data mining area. Extensive efforts have been devoted to developing efficient algorithms for mining frequent itemsets. However, not much attention has been paid to managing the large collection of frequent itemsets produced by these algorithms for subsequent analysis and for user exploration. In this paper, we study three structures for indexing ...



Publication date: 2004